The Set Covering Machine
Abstract
We extend the classical algorithms of Valiant and Haussler for learning compact conjunctions and disjunctions of Boolean attributes to allow features that are constructed from the data and to allow a trade-off between accuracy and complexity. The result is a general-purpose learning machine, suitable for practical learning tasks, that we call the set covering machine. We present a version of the set covering machine that uses data-dependent balls for its set of features and compare its performance with the support vector machine. By extending a technique pioneered by Littlestone and Warmuth, we bound its generalization error as a function of the amount of data compression it achieves during training. In experiments with real-world learning tasks, the bound is shown to be extremely tight and to provide an effective guide for model selection.
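The greedy, set-cover style of training that the abstract describes can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes Euclidean distance, uses only balls centered on negative training examples (each ball outputs 0 inside and 1 outside), and learns a conjunction of such balls. The names `train_scm` and `predict`, and the parameters `penalty` (the accuracy/complexity trade-off) and `max_features` (an early-stopping cap), are hypothetical choices made for this illustration.

```python
import math


def dist(a, b):
    """Euclidean distance between two points given as tuples."""
    return math.sqrt(sum((u - v) ** 2 for u, v in zip(a, b)))


def train_scm(X, y, penalty=1.0, max_features=3):
    """Greedily pick balls that cover negatives while sparing positives.

    Assumption of this sketch: every candidate ball is centered on a
    negative training example, and its radius is the distance from that
    center to some training example.
    """
    negatives = {i for i, label in enumerate(y) if label == 0}
    positives = {i for i, label in enumerate(y) if label == 1}
    candidates = [
        (X[c], dist(X[c], X[j])) for c in sorted(negatives) for j in range(len(X))
    ]
    chosen = []
    while negatives and len(chosen) < max_features:
        best, best_score, best_q, best_r = None, 0.0, set(), set()
        for center, radius in candidates:
            inside = {i for i in range(len(X)) if dist(X[i], center) <= radius}
            q = inside & negatives  # remaining negatives this ball covers
            r = inside & positives  # remaining positives it would misclassify
            score = len(q) - penalty * len(r)
            if score > best_score:
                best, best_score, best_q, best_r = (center, radius), score, q, r
        if best is None:  # no ball has positive usefulness
            break
        chosen.append(best)
        negatives -= best_q
        positives -= best_r
    return chosen


def predict(balls, x):
    """Conjunction: output 1 only if x lies outside every chosen ball."""
    return int(all(dist(x, center) > radius for center, radius in balls))


# Toy data: three negatives near the origin, three positives far away.
X = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
y = [0, 0, 0, 1, 1, 1]
balls = train_scm(X, y)
print(balls, [predict(balls, x) for x in X])
```

On this toy data a single ball suffices, so the learned classifier is described by one training example and one radius. That compactness is the kind of data compression the generalization bound in the abstract depends on.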
Similar resources
Multigranulation single valued neutrosophic covering-based rough sets and their applications to multi-criteria group decision making
In this paper, three types of (philosophical, optimistic and pessimistic) multigranulation single valued neutrosophic (SVN) covering-based rough set models are presented, and these three models are applied to the problem of multi-criteria group decision making (MCGDM). Firstly, a type of SVN covering-based rough set model is proposed. Based on this rough set model, three types of mult...
A set-covering formulation for a drayage problem with single and double container loads
This paper addresses a drayage problem, which is motivated by the case study of a real carrier. Its trucks carry one or two containers from a port to importers and from exporters to the port. Since up to four customers can be served in each route, we propose a set-covering formulation for this problem where all possible routes are enumerated. This model can be efficiently solved to optimality b...
Single Assignment Capacitated Hierarchical Hub Set Covering Problem for Service Delivery Systems Over Multilevel Networks
The present study introduced a novel hierarchical hub set covering problem with capacity constraints. This study showed the significance of fixed charge costs for locating facilities, assigning hub links and designing a productivity network. The proposed model employs mixed integer programming to locate facilities and establish links between nodes according to the travel time between an origin-...
A Local Branching Approach for the Set Covering Problem
The set covering problem (SCP) is a well-known combinatorial optimization problem. This paper investigates development of a local branching approach for the SCP. This solution strategy is exact in nature, though it is designed to improve the heuristic behavior of the mixed integer programming solver. The algorithm parameters are tuned by design of experiments approach. The proposed method is te...
The Set Covering Machine with Data-Dependent Half-Spaces
We examine the set covering machine when it uses data-dependent half-spaces for its set of features and bound its generalization error in terms of the number of training errors and the number of half-spaces it achieves on the training data. We show that it provides a favorable alternative to data-dependent balls on some natural data sets. Compared to the support vector machine, the set covering...
Learning with the Set Covering Machine
We generalize the classical algorithms of Valiant and Haussler for learning conjunctions and disjunctions of Boolean attributes to the problem of learning these functions over arbitrary sets of features; including features that are constructed from the data. The result is a general-purpose learning machine, suitable for practical learning tasks, that we call the Set Covering Machine. We present...
Journal title:
- Journal of Machine Learning Research
Volume 3, Issue -
Pages -
Publication date 2002